
    The Curious Case of the PDF Converter that Likes Mozart: Dissecting and Mitigating the Privacy Risk of Personal Cloud Apps

    Third-party apps that work on top of personal cloud services such as Google Drive and Dropbox require access to the user's data in order to provide their functionality. Through detailed analysis of a hundred popular Google Drive apps from Google's Chrome store, we discover that the existing permission model is quite often misused: around two thirds of the analyzed apps are over-privileged, i.e., they access more data than they need to function. In this work, we analyze three different permission models that aim to discourage users from installing over-privileged apps. In experiments with 210 real users, we discover that the most successful permission model is our novel ensemble method, which we call Far-reaching Insights. Far-reaching Insights inform users about the data-driven insights that apps can make about them (e.g., their topics of interest, collaboration and activity patterns, etc.), thus seeking to bridge the gap between what third parties can actually know about users and users' perception of their privacy leakage. The efficacy of Far-reaching Insights in bridging this gap is demonstrated by our results: Far-reaching Insights prove to be, on average, twice as effective as the current model in discouraging users from installing over-privileged apps. In an effort to promote general privacy awareness, we deploy a publicly available, privacy-oriented app store that uses Far-reaching Insights. Based on the knowledge extracted from the data of the store's users (over 115 gigabytes of Google Drive data from 1440 users with 662 installed apps), we also delineate the ecosystem of third-party cloud apps from the standpoint of developers and cloud providers. Finally, we present several general recommendations that can guide future work in the area of privacy for the cloud.
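
    As a rough illustration of the kind of data-driven insights described above (not the paper's implementation), the following Python sketch derives topics of interest, frequent collaborators and activity patterns from hypothetical Google Drive file metadata; all field names are assumptions.

    from collections import Counter
    from datetime import datetime

    def far_reaching_insights(files):
        # files: list of dicts with hypothetical keys 'title', 'collaborators', 'modified'
        topic_words = Counter()
        collaborators = Counter()
        active_hours = Counter()
        for f in files:
            topic_words.update(w.lower() for w in f["title"].split() if len(w) > 3)
            collaborators.update(f.get("collaborators", []))
            active_hours[datetime.fromisoformat(f["modified"]).hour] += 1
        return {
            "likely_topics_of_interest": [w for w, _ in topic_words.most_common(5)],
            "frequent_collaborators": [c for c, _ in collaborators.most_common(3)],
            "most_active_hour": max(active_hours, key=active_hours.get) if active_hours else None,
        }

    # A full-Drive-scoped "PDF converter" could compute all of this, even though
    # converting one document only requires access to that single file.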

    Design space analysis for modeling incentives in distributed systems

    Distributed systems without a central authority, such as peer-to-peer (P2P) systems, employ incentives to encourage nodes to follow the prescribed protocol. Game-theoretic analysis is often used to evaluate incentives in such systems. However, most game-theoretic analyses of distributed systems do not adequately model the repeated interactions of nodes inherent in such systems. We present a game-theoretic analysis of a popular P2P protocol, BitTorrent, that models the repeated interactions in such protocols. We also note that an analytical approach to modeling incentives is often infeasible given the complicated nature of most deployed protocols. In order to comprehensively model incentives in complex protocols, we propose a simulation-based method, which we call Design Space Analysis (DSA). DSA provides a tractable analysis of competing protocol variants within a detailed design space. We apply DSA to P2P file-swarming systems. With extensive simulations, we analyze a wide range of protocol variants and gain insights into their robustness and performance. To validate these results and to demonstrate the efficacy of DSA, we modify an instrumented BitTorrent client and evaluate protocols discovered using DSA. We show that they yield higher system performance and robustness relative to the reference implementation.
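
    The following Python sketch illustrates the Design Space Analysis idea in miniature, under assumptions of our own: a small, hypothetical design space of swarming-protocol parameters is enumerated, each variant is run through a toy swarm simulation with some free-riding peers, and variants are ranked by robustness and performance. It is not the paper's simulator.

    import itertools
    import random

    # Hypothetical design space of swarming-protocol parameters.
    DESIGN_SPACE = {
        "unchoke_slots": [3, 4, 5],
        "optimistic_unchoke": [True, False],
        "reciprocation": ["tit_for_tat", "proportional"],
    }

    def simulate_swarm(variant, peers=50, free_riders=10, rounds=200, seed=0):
        # Toy swarm: free-riders upload nothing; other peers upload at a random rate.
        rng = random.Random(seed)
        upload = [0.0 if i < free_riders else rng.uniform(0.5, 1.5) for i in range(peers)]
        downloaded = [0.0] * peers
        for _ in range(rounds):
            for i in range(peers):
                # Each peer is served by the strongest uploaders, plus one random
                # peer when optimistic unchoking is enabled.
                partners = sorted(range(peers), key=lambda j: upload[j], reverse=True)
                partners = partners[: variant["unchoke_slots"]]
                if variant["optimistic_unchoke"]:
                    partners.append(rng.randrange(peers))
                if variant["reciprocation"] == "tit_for_tat":
                    downloaded[i] += sum(min(upload[i], upload[j]) for j in partners)
                else:  # proportional sharing, softened so free-riders get a trickle
                    downloaded[i] += sum(upload[j] for j in partners) * max(upload[i], 0.1)
        return {
            "performance": sum(downloaded) / peers,
            "free_rider_share": sum(downloaded[:free_riders]) / max(sum(downloaded), 1e-9),
        }

    # Enumerate every variant and rank by robustness to free-riding, then by performance.
    variants = [dict(zip(DESIGN_SPACE, values)) for values in itertools.product(*DESIGN_SPACE.values())]
    scored = [(simulate_swarm(v), v) for v in variants]
    scored.sort(key=lambda pair: (pair[0]["free_rider_share"], -pair[0]["performance"]))
    best_metrics, best_variant = scored[0]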

    Data-Driven Privacy Indicators

    Third-party applications work on top of existing platforms that host users' data. Although these apps access this data to provide users with specific services, they can also use it for monetization or profiling purposes. In practice, there is a significant gap between users' privacy expectations and the actual access levels of third-party apps, which are often over-privileged. Due to weaknesses in the existing privacy indicators, users are generally not well informed about what data these apps get. Moreover, we are witnessing the rise of inverse privacy: third parties collect data that enables them to know information about users that the users themselves do not know, cannot remember, or cannot reach. In this paper, we describe our recent experiences with the design and evaluation of Data-Driven Privacy Indicators (DDPIs), an approach that attempts to reduce the aforementioned privacy gap. DDPIs are realized by having a trusted party (e.g., the app platform) analyze the user's data and by integrating the analysis results into the privacy indicator's interface. We discuss DDPIs in the context of third-party apps on cloud platforms, such as Google Drive and Dropbox. Specifically, we present our recent work on Far-reaching Insights, which show users the insights that apps can infer about them (e.g., their topics of interest, collaboration and activity patterns, etc.). We then present History-based Insights, a novel privacy indicator that informs the user about what data is already accessible to an app vendor, based on previous app installations by the user or her collaborators. We further discuss ideas for new DDPIs, and we outline the challenges facing the wide-scale deployment of such indicators.
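
    As an illustration of a History-based Insight (with hypothetical field names, not the authors' implementation), the sketch below compares the files a new app requests against what the same vendor can already reach through earlier installations by the user or her collaborators.

    def history_based_insight(requested_files, prior_installs, vendor):
        # requested_files: set of file ids the new app asks for.
        # prior_installs: list of dicts {'vendor': str, 'accessible_files': set}.
        already_accessible = set()
        for install in prior_installs:
            if install["vendor"] == vendor:
                already_accessible |= install["accessible_files"]
        overlap = requested_files & already_accessible
        newly_exposed = requested_files - already_accessible
        return {
            "already_known_to_vendor": len(overlap),
            "newly_exposed": len(newly_exposed),
            "share_already_known": len(overlap) / len(requested_files) if requested_files else 0.0,
        }

    # The indicator could then say, e.g., "this vendor already has access to 80% of
    # the files this app is requesting", instead of listing abstract permission scopes.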

    Adversarial Branch Architecture Search for Unsupervised Domain Adaptation

    Unsupervised Domain Adaptation (UDA) is a key issue in visual recognition, as it makes it possible to bridge different visual domains and achieve robust performance in the real world. To date, all proposed approaches rely on human expertise to manually adapt a given UDA method (e.g., DANN) to a specific backbone architecture (e.g., ResNet). This dependency on handcrafted designs limits the applicability of a given approach over time, as old methods need to be constantly adapted to novel backbones. Existing Neural Architecture Search (NAS) approaches cannot be directly applied to mitigate this issue, as they rely on labels that are not available in the UDA setting. Furthermore, most NAS methods search for full architectures, which precludes the use of pre-trained models, essential in a vast range of UDA settings for reaching state-of-the-art results. To the best of our knowledge, no prior work has addressed these aspects in the context of NAS for UDA. Here we tackle both aspects with Adversarial Branch Architecture Search for UDA (ABAS): (i) we address the lack of target labels with a novel data-driven ensemble approach for model selection; and (ii) we search for an auxiliary adversarial branch, attached to a pre-trained backbone, which drives the domain alignment. We extensively validate ABAS to improve two modern UDA techniques, DANN and ALDA, on three standard visual recognition datasets (Office31, Office-Home and PACS). In all cases, ABAS robustly finds the adversarial branch architectures and parameters that yield the best performance.
    Comment: Accepted at WACV 202
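
    For context, the sketch below shows a minimal DANN-style adversarial branch in PyTorch: a gradient-reversal layer followed by a small domain classifier attached to backbone features. The fixed width and depth here are placeholders; searching over exactly such branch hyperparameters is what ABAS automates. This is not the authors' code.

    import torch
    from torch import nn

    class GradReverse(torch.autograd.Function):
        # Identity in the forward pass; flips and scales gradients in the backward pass.
        @staticmethod
        def forward(ctx, x, lambd):
            ctx.lambd = lambd
            return x.view_as(x)

        @staticmethod
        def backward(ctx, grad_output):
            return grad_output.neg() * ctx.lambd, None

    class AdversarialBranch(nn.Module):
        def __init__(self, feat_dim, hidden=256, lambd=1.0):
            super().__init__()
            self.lambd = lambd
            self.domain_classifier = nn.Sequential(
                nn.Linear(feat_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, 2))  # source vs. target domain

        def forward(self, backbone_features):
            reversed_feats = GradReverse.apply(backbone_features, self.lambd)
            return self.domain_classifier(reversed_feats)

    # Usage: domain_logits = AdversarialBranch(2048)(resnet_features); train with a
    # cross-entropy domain loss alongside the task loss on labeled source data.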

    StoreSim: Optimizing Information Leakage in Multicloud Storage Services

    Many schemes have recently been advanced for storing data on multiple clouds. Distributing data over different cloud storage providers (CSPs) automatically provides users with a certain degree of information leakage control, as no single point of attack can leak all of a user's information. However, unplanned distribution of data chunks can lead to high information disclosure even while using multiple clouds. In this paper, to address this problem, we present StoreSim, an information-leakage-aware storage system for multicloud settings. StoreSim aims to store syntactically similar data on the same cloud, thus minimizing the user's information leakage across multiple clouds. We design an approximate algorithm that efficiently generates similarity-preserving signatures for data chunks based on MinHash and Bloom filters, and we also design a function that computes the information leakage based on these signatures. Next, we present an effective storage-plan generation algorithm based on clustering for distributing data chunks across multiple clouds with minimal information leakage. Finally, we evaluate our scheme using two real datasets from Wikipedia and GitHub. We show that our scheme can reduce information leakage by up to 60% compared to unplanned placement.
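
    A minimal sketch of similarity-preserving signatures in Python (not StoreSim's exact construction, which also incorporates a Bloom filter): MinHash signatures over byte shingles, whose fraction of matching entries approximates the Jaccard similarity between two chunks. The shingle size and number of hash functions are arbitrary choices here.

    import hashlib

    def shingles(chunk: bytes, k: int = 8):
        # Overlapping k-byte substrings of the chunk (the whole chunk if it is shorter than k).
        return {chunk[i:i + k] for i in range(max(len(chunk) - k + 1, 1))}

    def minhash_signature(chunk: bytes, num_hashes: int = 64):
        sig = []
        for seed in range(num_hashes):
            salt = seed.to_bytes(8, "big")  # one salted hash function per signature slot
            sig.append(min(
                int.from_bytes(hashlib.blake2b(s, digest_size=8, salt=salt).digest(), "big")
                for s in shingles(chunk)))
        return sig

    def estimated_similarity(sig_a, sig_b):
        # The fraction of matching MinHash values approximates the Jaccard similarity.
        return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)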

    CoShare: A Cost-effective Data Sharing System for Data Center Networks

    Numerous research groups and other organizations collect data from popular data sources such as online social networks. This leads to the problem of data islands, wherein all this data is isolated and lying idle, without any use to the community at large. Using existing centralized solutions such as Dropbox to replicate data to all interested parties is prohibitively costly, given the large size of the datasets. A practical solution is to use a peer-to-peer (P2P) approach to replicate data in a self-organized manner. However, existing P2P approaches focus on minimizing downloading time without taking the bandwidth cost into account. In this paper, we present CoShare, a P2P-inspired, decentralized, cost-effective sharing system for data replication. CoShare allows users to specify their requirements on data sharing tasks and maps these requirements into resource requirements for data transfer. Through extensive simulations, we demonstrate that CoShare finds desirable cost-performance tradeoffs under varying user requirements and request arrival rates.
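
    To illustrate how sharing requirements might be mapped into a transfer plan (an illustrative simplification, not CoShare's actual algorithm), the sketch below greedily picks the cheapest set of source peers whose combined bandwidth still meets a requested replication deadline; all names and numbers are hypothetical.

    def plan_transfer(dataset_gb, deadline_hours, sources):
        # sources: list of dicts {'name', 'bandwidth_gb_per_h', 'cost_per_gb'}.
        required_rate = dataset_gb / deadline_hours
        chosen, rate = [], 0.0
        for src in sorted(sources, key=lambda s: s["cost_per_gb"]):
            if rate >= required_rate:
                break
            chosen.append(src)
            rate += src["bandwidth_gb_per_h"]
        if rate < required_rate:
            return None  # the requirement cannot be met with the available sources
        share = dataset_gb / len(chosen)  # naive even split across the chosen sources
        cost = sum(share * s["cost_per_gb"] for s in chosen)
        return {"sources": [s["name"] for s in chosen], "estimated_cost": cost}

    # Tightening the deadline forces more (and more expensive) sources into the plan,
    # which is the cost-performance tradeoff the system has to navigate.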

    Optimizing Information Leakage in Multicloud Storage Services

    Many schemes have recently been advanced for storing data on multiple clouds. Distributing data over multiple cloud storage providers automatically provides users with a certain degree of information leakage control, as no single point of attack can leak all the information. However, unplanned distribution of data chunks can lead to high information disclosure even while using multiple clouds. In this paper, we study an important information leakage problem caused by unplanned data distribution in multicloud storage services. We then present StoreSim, an information-leakage-aware storage system for multicloud settings. StoreSim aims to store syntactically similar data on the same cloud, thus minimizing the user's information leakage across multiple clouds. We design an approximate algorithm that efficiently generates similarity-preserving signatures for data chunks based on MinHash and Bloom filters, and we also design a function that computes the information leakage based on these signatures. Next, we present an effective storage-plan generation algorithm based on clustering for distributing data chunks across multiple clouds with minimal information leakage. Finally, we evaluate our scheme using two real datasets from Wikipedia and GitHub. We show that our scheme can reduce information leakage by up to 60% compared to unplanned placement. Furthermore, our analysis of system attackability demonstrates that our scheme makes attacks on the information more complex.
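
    Complementing the signature sketch above, here is an illustrative (not StoreSim's actual) greedy storage-plan generator: each chunk is placed on the cloud holding the most similar already-stored data, so that similar chunks cluster on one provider and dissimilar chunks spread out. The similarity function is assumed to return a value in [0, 1], e.g. a MinHash estimate.

    def generate_storage_plan(chunks, num_clouds, similarity):
        plan = {cloud: [] for cloud in range(num_clouds)}
        for chunk in chunks:
            def affinity(cloud):
                stored = plan[cloud]
                if not stored:
                    return 0.0
                return max(similarity(chunk, other) for other in stored)
            # Prefer the cloud with the most similar existing data; break ties by lower load,
            # which spreads unrelated chunks evenly across providers.
            target = max(range(num_clouds), key=lambda c: (affinity(c), -len(plan[c])))
            plan[target].append(chunk)
        return plan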

    IslamGRID: Knowledge Management at Jakim

    IslamGRID was developed to collect, preserve, disseminate and promote knowledge of Islam to all Muslims globally, whether in administrative, industrial or commercial settings. With the advent of information technology and the increase in the amount of information that organizations have to deal with, there is a need to put strategic processes in place to ensure the effective management of knowledge. This case study research looks into three main areas of a Knowledge Management (KM) System (KMS), namely (i) People, (ii) Process and (iii) Technology, and their relationships, which may affect the effectiveness of IslamGRID for the Islamic community (Ummah). The application of these areas would be relevant to JAKIM, JAIN, and Muslims as well as non-Muslims.
